Characterizing Visitors to a Website Across Multiple Sessions

Authors

  • Arindam Banerjee
  • Joydeep Ghosh
Abstract

Characterizing web users based on their interactions with a particular website is a key problem in web mining. Such interactions are reflected in the clickstream data recorded in weblogs, the content of the pages that users view, and their actions, such as the search queries they type in. In general, a wide variety of behaviors is observed at popular websites. To make the problem more tractable, it is therefore often wise to first group similar users into relatively homogeneous segments and then try to characterize each segment. Much work has been done on clustering visitors at a site, but the analyses typically use substantially simplified descriptions of session behavior and do not characterize users based on their footprints across multiple visits to the site. In this paper, each web user is represented by a set of sessions, and a two-stage methodology for grouping such users is proposed. In the first stage, sessions are clustered into conceptual session types based on both the trajectory taken through the website and the time spent at each page. The distribution of a user’s multiple visits across these session types forms the basis of user grouping in the second stage. Such clustering based on a rich user description forms a platform for further characterization. Results are presented using weblogs of a popular website to illustrate the techniques.
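
The two-stage idea in the abstract can be illustrated with a minimal Python sketch. Everything concrete here is an assumption made for illustration: the abstract does not state the paper's similarity measure or clustering algorithm, so plain k-means is swapped in for both stages, each session is reduced to the total seconds spent per page (which ignores visit order), and the page names, users, and timings are invented toy data.

```python
from collections import Counter

# Hypothetical page vocabulary and toy data (not from the paper). Each session
# is a list of (page, seconds_spent) pairs in visit order.
PAGES = ["home", "search", "product", "cart"]
USER_SESSIONS = {
    "u1": [[("home", 5), ("search", 40), ("search", 30)], [("home", 4), ("search", 50)]],
    "u2": [[("home", 3), ("search", 45)], [("search", 60), ("home", 2)]],
    "u3": [[("home", 5), ("product", 60), ("cart", 30)], [("product", 50), ("cart", 40)]],
    "u4": [[("product", 70), ("cart", 20)], [("home", 2), ("product", 55), ("cart", 35)]],
}

def session_vector(session):
    """Total seconds per page; a simplification that ignores visit order."""
    totals = Counter()
    for page, seconds in session:
        totals[page] += seconds
    return [float(totals[p]) for p in PAGES]

def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(points, k, iters=20):
    """Plain k-means, initialized with the first k distinct points."""
    centers = []
    for p in points:
        if list(p) not in centers:
            centers.append(list(p))
        if len(centers) == k:
            break
    labels = [0] * len(points)
    for _ in range(iters):
        # Assign every point to its nearest center.
        labels = [min(range(len(centers)), key=lambda c: sq_dist(p, centers[c]))
                  for p in points]
        # Move each non-empty cluster's center to the mean of its members.
        for c in range(len(centers)):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels

def cluster_users(user_sessions, n_session_types=2, n_user_groups=2):
    users = list(user_sessions)
    owners, vectors = [], []
    for user in users:
        for session in user_sessions[user]:
            owners.append(user)
            vectors.append(session_vector(session))
    # Stage 1: cluster all sessions into session types.
    session_types = kmeans(vectors, n_session_types)
    # Stage 2: describe each user by the share of their sessions falling in
    # each session type, then cluster those distributions.
    profiles = []
    for user in users:
        counts = Counter(t for o, t in zip(owners, session_types) if o == user)
        total = sum(counts.values())
        profiles.append([counts[t] / total for t in range(n_session_types)])
    return dict(zip(users, kmeans(profiles, n_user_groups)))

user_groups = cluster_users(USER_SESSIONS)
```

On this toy data the search-heavy users (u1, u2) land in one group and the purchase-oriented users (u3, u4) in the other. A more faithful implementation would need a session similarity that accounts for the trajectory through the site, which this sketch deliberately leaves out.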

Similar resources

Monitoring Web Site Usage of e-Bug: A Hygiene and Antibiotic Awareness Resource for Children

BACKGROUND e-Bug is an educational resource which teaches children and young people about microbes, hygiene, infection, and prudent antibiotic use. The e-Bug resources are available in over 22 different languages and they are used widely across the globe. The resources can be accessed from the e-Bug website. OBJECTIVE The objective of this study was to analyze the usage of the e-Bug website i...

Using Google Analytics for measuring inlinks effectiveness

The aim of this brief communication is to develop a tracking methodology to analyse the effectiveness of inlink visits (return visit behaviour and length of sessions). In other words, how deep do inlink visitors navigate into the website? Do all inlinks perform the same? This paper addresses these questions by time series analysis of Google Analytics data, with a methodology developed by Plaza ...

Understanding Successive Searches Across Multiple Sessions Over the Web

This study intends to enhance the understanding of successive searches over multiple sessions by characterizing successive searches with a conceptual model, Multiple Information Seeking Episodes (MISE), validating MISE and supporting successive searches with a prototyped information system, PERsonalized and Successive Information Seeking Toolkit (PERSIST), whose requirements are derived from MI...

Probabilistic Deduplication of Anonymous Web Traffic

Cookies and log in-based authentication often provide incomplete data for stitching website visitors across multiple sources, necessitating probabilistic deduplication. We address this challenge by formulating the problem as a binary classification task for pairs of anonymous visitors. We compute visitor proximity vectors by converting categorical variables like IP addresses, product search key...

SkyServer Traffic Report - The First Five Years

The SkyServer is an Internet portal to the Sloan Digital Sky Survey Catalog Archive Server. From 2001 to 2006, there were a million visitors in 3 million sessions generating 170 million Web hits, 16 million ad-hoc SQL queries, and 65 million page views. The site currently averages 35 thousand visitors and 400 thousand sessions per month. The Web and SQL logs are public. We analyzed traffic and ...

Publication date: 2002